-
Notifications
You must be signed in to change notification settings - Fork 808
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
perf(pipeline): Improve execution times for dense pipeline graphs #4824
Conversation
.collect(toList()); | ||
List<StageExecution> syntheticStages = | ||
stage.getExecution().getStages().stream() | ||
.filter(s -> s.getSyntheticStageOwner() != null) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was it really going through all the stages before and not only the STAGE_BEFORE/AFTER?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes :(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is a good catch 🚀 👏
when given a complex pipeline with multiple layers of upstream stages
This turns the anyUpstreamStagesFailed calculation from one that scales exponentially based on the number of (branches+downstream stages) in a pipeline to one that scales linearly based on the total number of stages in a pipeline. This is a significant performance improvement, especially for very large and complicated pipelines
since that's the only place where it gets used
for memoization. The recursive anyUpstreamStagesFailed(StageExecution) function runs in a single thread, so ConcurrentHashMap is not necessary here
before checking for parent stages. stage.getRequisiteStageRefIds is a more expensive call because the underlying implementation creates a copy of a List. Therefore, start with the cheaper operation first hoping to short-circuit and avoid the more expensive check
The javadocs state that the syntheticStageOwner property is null for non-synthetic stages. Use this information to filter out non-synthetic stages before attempting a potentially expensive operation to check for synthetic parents of previousStages
StageExecutionImpl.getRequisiteStageRefIds() returns a copy of a Set. This is a costly operation that has the potential to get repeated for every unvisited stage. To avoid this, compute the value before entering a loop
withAuth is only necessary when starting a stage. Since withAuth is very computationally expensive for complex pipelines, only call it when it is absolutely necessary.
StartStageHandler already makes a call to message.withStage at the beginning of the handle() method. Therefore, this call within the catch block is unnecessary
7dbef46
to
72fb2bf
Compare
There's a number of performance improvements here:
Overall, these improvements reduce pipeline execution times across the board, with the biggest gains seen from very dense pipeline graphs such as the contrived example in ComplexPipeline.kt. After the optimizations, the timing of the test added to StartStageHandler improves from never completing to finishing in ~160ms on my machine.